library(tidyverse)
library(ggplot2)
library(lubridate)
library(ggmap)
library(dplyr)
library(data.table)
library(ggrepel)
To import all the data, we read each year range in separately from its CSV file and then combine the pieces with rbind to create a complete data set.
to2004 <- read.csv("Chicago_Crimes_2001_to_2004.csv", stringsAsFactors = FALSE)
to2007 <- read.csv("Chicago_Crimes_2005_to_2007.csv", stringsAsFactors = FALSE)
to2011 <- read.csv("Chicago_Crimes_2008_to_2011.csv", stringsAsFactors = FALSE)
to2017 <- read.csv("Chicago_Crimes_2012_to_2017.csv", stringsAsFactors = FALSE)
all <- rbind(to2004, to2007, to2011, to2017)
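The four reads above follow the same pattern, so they could also be driven from a vector of file names. The sketch below shows one way to do that in base R. It writes two tiny stand-in CSV files to a temporary directory so that it runs without the real data; the file names and columns are made up for illustration.

```r
# Stand-in data: two small CSV files in a temporary directory.
# The file names and columns are illustrative, not the real Chicago files.
tmp <- tempdir()
paths <- file.path(tmp, c("crimes_part1.csv", "crimes_part2.csv"))
write.csv(data.frame(ID = 1:2, Year = c(2003, 2004)), paths[1], row.names = FALSE)
write.csv(data.frame(ID = 3:5, Year = c(2005, 2006, 2007)), paths[2], row.names = FALSE)

# Read every file with the same settings, then stack the pieces row-wise.
# rbind() requires all pieces to share the same column names.
parts <- lapply(paths, read.csv, stringsAsFactors = FALSE)
combined <- do.call(rbind, parts)

nrow(combined)  # total number of rows across both files
```

With the real files, `paths` would hold the four `Chicago_Crimes_*.csv` names used above.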
In addition, to plot the data we need a map of the Chicago area.
# Get the map from Google Maps (recent ggmap versions need an API key registered with register_google() first)
map <- get_map(location = c(lon = -87.645167, lat = 41.808013), zoom = 11, maptype = "roadmap", color = "bw")
After importing the data, I decided to focus only on the most recent 10 years (2008 to 2017) to reduce the processing load on my computer. Since some of the values were imported as strings, they must be converted to numeric values before they can be graphed. Finally, only points with both a latitude and a longitude can be plotted, so I created a new data frame containing only those points.
crimes <- filter(all, Year > 2007)
crimes <- filter(crimes, Year < 2018)
crimes$Longitude <- as.numeric(crimes$Longitude)
crimes$Latitude <- as.numeric(crimes$Latitude)
hasLocation <- filter(crimes, !is.na(Longitude), !is.na(Latitude))
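To see why the conversion and the NA filter work together, here is a small sketch on made-up data (the `toy` data frame is an assumption for illustration, not part of the Chicago data set). Coordinates stored as text that cannot be parsed become `NA` after `as.numeric()`, and the filter then drops those rows.

```r
library(dplyr)

# Toy data: coordinates stored as text, with one empty and one non-numeric entry.
toy <- data.frame(
  Primary.Type = c("ASSAULT", "THEFT", "BURGLARY"),
  Longitude = c("-87.65", "", "unknown"),
  Latitude = c("41.80", "41.90", "41.85"),
  stringsAsFactors = FALSE
)

# as.numeric() returns NA for strings it cannot parse. The coercion warning
# for "unknown" is expected, so it is silenced here.
toy$Longitude <- suppressWarnings(as.numeric(toy$Longitude))
toy$Latitude <- as.numeric(toy$Latitude)

# Keep only rows where both coordinates are present.
located <- filter(toy, !is.na(Longitude), !is.na(Latitude))
located
```

Only the row with two parseable coordinates survives the filter.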
For my analysis of the Chicago Crime data set I will focus on how different crimes are distributed across Chicago. Since there were over 30 different crime types, I chose 6 to analyze. These 6 were chosen because of their distributions and the popularity of each crime. To best represent the distribution of each crime, I created a heatmap of all instances of that crime over the past 10 years.
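One way to survey the available crime types before picking a handful is to count incidents per type. The sketch below does this with `dplyr::count()` on a made-up stand-in for `hasLocation`; the values are invented for illustration.

```r
library(dplyr)

# Toy stand-in for hasLocation; only the Primary.Type column matters here.
toy <- data.frame(
  Primary.Type = c("THEFT", "ASSAULT", "THEFT", "NARCOTICS", "THEFT", "ASSAULT"),
  stringsAsFactors = FALSE
)

# Number of incidents per crime type, most frequent first.
type_counts <- count(toy, Primary.Type, sort = TRUE)
type_counts
```

On the real data, the same call would be `count(hasLocation, Primary.Type, sort = TRUE)`.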
# Assault: contour lines plus semi-transparent filled density polygons
singleCrime <- filter(hasLocation, Primary.Type == "ASSAULT")
ggmap(map, extent = "device") +
  geom_density2d(data = singleCrime, aes(x = Longitude, y = Latitude), size = 0.3) +
  stat_density2d(data = singleCrime,
                 aes(x = Longitude, y = Latitude, fill = ..level.., alpha = ..level..),
                 size = 0.01, bins = 50, geom = "polygon") +
  scale_fill_gradient(low = "green", high = "red") +
  scale_alpha(range = c(0, 0.3), guide = FALSE)
# Gambling
singleCrime <- filter(hasLocation, Primary.Type == "GAMBLING")
ggmap(map, extent = "device") +
  geom_density2d(data = singleCrime, aes(x = Longitude, y = Latitude), size = 0.3) +
  stat_density2d(data = singleCrime,
                 aes(x = Longitude, y = Latitude, fill = ..level.., alpha = ..level..),
                 size = 0.01, bins = 50, geom = "polygon") +
  scale_fill_gradient(low = "green", high = "red") +
  scale_alpha(range = c(0, 0.3), guide = FALSE)
# Kidnapping
singleCrime <- filter(hasLocation, Primary.Type == "KIDNAPPING")
ggmap(map, extent = "device") +
  geom_density2d(data = singleCrime, aes(x = Longitude, y = Latitude), size = 0.3) +
  stat_density2d(data = singleCrime,
                 aes(x = Longitude, y = Latitude, fill = ..level.., alpha = ..level..),
                 size = 0.01, bins = 50, geom = "polygon") +
  scale_fill_gradient(low = "green", high = "red") +
  scale_alpha(range = c(0, 0.3), guide = FALSE)
# Burglary
singleCrime <- filter(hasLocation, Primary.Type == "BURGLARY")
ggmap(map, extent = "device") +
  geom_density2d(data = singleCrime, aes(x = Longitude, y = Latitude), size = 0.3) +
  stat_density2d(data = singleCrime,
                 aes(x = Longitude, y = Latitude, fill = ..level.., alpha = ..level..),
                 size = 0.01, bins = 50, geom = "polygon") +
  scale_fill_gradient(low = "green", high = "red") +
  scale_alpha(range = c(0, 0.3), guide = FALSE)
# Homicide
singleCrime <- filter(hasLocation, Primary.Type == "HOMICIDE")
ggmap(map, extent = "device") +
  geom_density2d(data = singleCrime, aes(x = Longitude, y = Latitude), size = 0.3) +
  stat_density2d(data = singleCrime,
                 aes(x = Longitude, y = Latitude, fill = ..level.., alpha = ..level..),
                 size = 0.01, bins = 50, geom = "polygon") +
  scale_fill_gradient(low = "green", high = "red") +
  scale_alpha(range = c(0, 0.3), guide = FALSE)
# Narcotics
singleCrime <- filter(hasLocation, Primary.Type == "NARCOTICS")
ggmap(map, extent = "device") +
  geom_density2d(data = singleCrime, aes(x = Longitude, y = Latitude), size = 0.3) +
  stat_density2d(data = singleCrime,
                 aes(x = Longitude, y = Latitude, fill = ..level.., alpha = ..level..),
                 size = 0.01, bins = 50, geom = "polygon") +
  scale_fill_gradient(low = "green", high = "red") +
  scale_alpha(range = c(0, 0.3), guide = FALSE)
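The six plots above differ only in the crime type, so the shared layers could be wrapped in a function. The following is a sketch of that idea, not the code used for the plots above. `plot_crime_density` is a hypothetical helper name, the data is synthetic, and an empty `ggplot()` stands in for the map so the example runs without a Google API key. It also assumes ggplot2 3.4.0 or later, where `after_stat(level)`, `linewidth`, and `guide = "none"` replace the older `..level..`, `size`, and `guide = FALSE` spellings.

```r
library(ggplot2)

# Hypothetical helper: adds the contour and filled-density layers for one
# crime type to any base plot (e.g. a ggmap() object or an empty ggplot()).
plot_crime_density <- function(base_plot, data, crime_type) {
  one_crime <- data[data$Primary.Type == crime_type, ]
  base_plot +
    geom_density_2d(data = one_crime, aes(x = Longitude, y = Latitude),
                    linewidth = 0.3) +
    stat_density_2d(data = one_crime,
                    aes(x = Longitude, y = Latitude,
                        fill = after_stat(level), alpha = after_stat(level)),
                    bins = 50, geom = "polygon") +
    scale_fill_gradient(low = "green", high = "red") +
    scale_alpha(range = c(0, 0.3), guide = "none")
}

# Synthetic points standing in for hasLocation (not real crime data).
set.seed(1)
toy <- data.frame(
  Primary.Type = rep(c("ASSAULT", "BURGLARY"), each = 200),
  Longitude = rnorm(400, mean = -87.65, sd = 0.05),
  Latitude = rnorm(400, mean = 41.81, sd = 0.05)
)

# Build one plot per crime type and keep them in a named list.
crime_types <- c("ASSAULT", "BURGLARY")
plots <- lapply(crime_types, function(type) plot_crime_density(ggplot(), toy, type))
names(plots) <- crime_types
```

Printing `plots$ASSAULT` draws the first plot. With the real data, the base plot would be `ggmap(map, extent = "device")` and `crime_types` would list the six crimes analyzed above.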